Minimizing search errors due to delayed bigrams in real-time speech recognition systems

Authors

  • Monika Woszczyna
  • Michael Finke
Abstract

When building applications from large vocabulary speech recognition systems, a certain number of search errors due to pruning often has to be accepted in order to obtain the required speed. In this paper we tackle the problems resulting from the aggressive pruning strategies typically applied in large vocabulary systems to achieve close to real-time performance. We consider a typical scenario of a two-pass Viterbi search with the first pass organized as a phoneme (allophone) tree. For such a tree-organized lexicon, there are two possibilities for using a bigram language model: either building tree copies or using so-called delayed bigrams. Since copying trees turns out to be too expensive for real-time applications, we focus on delayed bigrams, discuss their drastic influence on word accuracy, and show how to alleviate the disastrous effect of delayed bigrams under aggressive pruning.
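The difference between the two options named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: all scores, word names, and the function names `delayed_bigram` and `tree_copies` are invented for illustration, and it assumes that a single tree keeps only the best-scoring predecessor at the tree start, while tree copies keep one hypothesis per predecessor until the word end.

```python
# Toy log-scores; every number here is an assumption made for illustration.
# Path score of each predecessor word hypothesis ending at the same frame.
predecessor_scores = {"the": -10.0, "a": -10.5}

# Acoustic log-score of the successor word, assumed to be the same for
# both predecessors (same start frame, same phoneme tree).
acoustic = {"cat": -5.0}

# Bigram log-probabilities log P(word | predecessor).
bigram = {("the", "cat"): -3.0, ("a", "cat"): -1.0}


def delayed_bigram(word):
    """Single tree: only the best predecessor enters the tree root.
    The bigram is applied late, once the leaf reveals the word identity."""
    best_prev = max(predecessor_scores, key=predecessor_scores.get)
    score = predecessor_scores[best_prev] + acoustic[word] + bigram[(best_prev, word)]
    return best_prev, score


def tree_copies(word):
    """One tree copy per predecessor: every predecessor survives to the
    word end, so the bigram can select the best (predecessor, word) pair."""
    candidates = {
        prev: prev_score + acoustic[word] + bigram[(prev, word)]
        for prev, prev_score in predecessor_scores.items()
    }
    best_prev = max(candidates, key=candidates.get)
    return best_prev, candidates[best_prev]


print(delayed_bigram("cat"))  # predecessor chosen before the bigram is known
print(tree_copies("cat"))     # predecessor chosen with the bigram included
```

In this toy setup the single tree commits to "the" before the bigram is known and ends with a worse total score than the tree-copy search, which keeps "a" alive. That gap is one kind of search error attributable to delayed bigrams, and it exists here even without any beam pruning.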


Similar resources

Rescoring-Aware Beam Search for Reduced Search Errors in Contextual Automatic Speech Recognition

Using context in automatic speech recognition allows the recognition system to dynamically task-adapt and bring gains to a broad variety of use-cases. An important mechanism of context inclusion is on-the-fly rescoring of hypotheses with contextual language model content available only in real-time. In systems where rescoring occurs on the lattice during its construction as part of beam search d...


Optimizing question answering systems by Accelerated Particle Swarm Optimization (APSO)

One of the most important research areas in natural language processing is Question Answering Systems (QASs). Existing search engines, with Google at the top, have many remarkable capabilities. But there is a basic limitation (search engines do not have deduction capability), a capability which a QAS is expected to have. In this perspective, a search engine may be viewed as a semi-mechanized QA...


Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...


Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...


Efficient Algorithms for Speech Recognition Thesis Committee

Advances in speech technology and computing power have created a surge of interest in the practical application of speech recognition. However, the most accurate speech recognition systems in the research world are still far too slow and expensive to be used in practical, large vocabulary continuous speech applications. Their main goal has been recognition accuracy, with emphasis on acoustic an...




Publication date: 1996